AITopics | second term

Denoising diffusion models have evolved into a state-of-the-art method for tasks in various fields, such as denoising and generation of images, text generation, or generation of synthetic data for training of other machine learning models. First hitting diffusion models (FHDMs) are a particular class of denoising diffusion models with random adaptive generation time, tailored to generate data on a known manifold. Building on the conditioning framework of Doob's $h$-transform, these models leverage the given information on the target data manifold to demonstrate strong performance across tasks while offering distinct features such as time-homogeneous dynamics of the generating process and a reduced average simulation time. Even though the theoretical investigation of standard forward-backward diffusion models has attracted much attention in the recent past, the statistical convergence properties of FHDMs are not yet understood. In this work, we show that, up to logarithmic factors, FHDMs achieve the minimax optimal convergence rate in total variation for spherically supported Sobolev smooth data distributions. In particular, this is the first statistical optimality result for denoising diffusion modelling with random generation time.
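The idea of a random generation time can be illustrated with a minimal sketch that is not the FHDM sampler itself. It simulates plain Brownian motion started at the origin and stops it the first time it leaves the unit ball, so the run time differs from sample to sample. The function name `first_hit_unit_sphere`, the step size `dt`, and the Euler discretization are assumptions made for illustration. The Doob $h$-transform drift that would steer the exit point toward a data distribution is omitted, so the exit point here is uniform on the sphere by symmetry.

```python
import math
import random


def first_hit_unit_sphere(dim, dt=1e-3, seed=0):
    """Simulate Brownian motion from the origin until it exits the unit ball.

    Returns (exit_point, hitting_time). Illustrative only: no learned or
    conditioned drift is applied, unlike in an actual FHDM.
    """
    rng = random.Random(seed)
    x = [0.0] * dim
    t = 0.0
    scale = math.sqrt(dt)  # standard deviation of one Brownian increment
    while True:
        x = [xi + scale * rng.gauss(0.0, 1.0) for xi in x]
        t += dt
        r = math.sqrt(sum(xi * xi for xi in x))
        if r >= 1.0:
            # The discrete step overshoots slightly; project back onto the sphere.
            return [xi / r for xi in x], t


if __name__ == "__main__":
    for seed in range(3):
        point, tau = first_hit_unit_sphere(dim=3, seed=seed)
        print(seed, round(tau, 3), [round(p, 3) for p in point])
```

Each call returns a different stopping time, in contrast to a standard diffusion sampler, which runs for a fixed horizon.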

approximation, artificial intelligence, machine learning, (17 more...)

arXiv.org Machine Learning

2605.07625

Country: Europe > Austria (0.28)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

fc30caeb45721bab13507c50199e6403-Supplemental-Conference.pdf

Neural Information Processing Systems, Apr-30-2026, 09:53:23 GMT

artificial intelligence, machine learning, probability assignment, (15 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.68)

bc5fcb0018cecacba559dc512740091b-Supplemental.pdf

Neural Information Processing Systems, Apr-26-2026, 20:34:16 GMT

artificial intelligence, equation, machine learning, (14 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

3c63ec7be1b6c49e6c308397023fd8cd-Supplemental.pdf

Neural Information Processing Systems, Apr-25-2026, 12:57:00 GMT

algorithm, artificial intelligence, machine learning, (18 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.30)

26300457961c3e056ea61c9d3ebec2a4-Paper-Conference.pdf

Neural Information Processing Systems, Apr-25-2026, 03:47:33 GMT

artificial intelligence, domain adaptation, machine learning, (18 more...)

Country: Asia > China (0.28)

Genre: Research Report (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)

240ac9371ec2671ae99847c3ae2e6384-Supplemental.pdf

Neural Information Processing Systems, Apr-25-2026, 03:29:13 GMT

artificial intelligence, exploitation phase, probability, (17 more...)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)

1c71cd4032da425409d8ada8727bad42-Supplemental-Conference.pdf

Neural Information Processing Systems, Apr-24-2026, 23:09:19 GMT

We can see that the error for the first term is mainly due to the sample approximation. We therefore refer to the first term as the Variance. We refer to the second term as the Bias. Our proof of convergence of the bias adapts the proof in [31, Theorem 6] and [11], and utilizes the fact that $C_{Y|X}$ is Hilbert-Schmidt to obtain a sharp rate.

A.1 Bounding the Bias

In this section, we establish the bound on the bias.

artificial intelligence, conditional distribution, machine learning, (18 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.69)

Supplementary material for Discrete Valued Neural Communication in Structured Architectures Enhances Generalization

Neural Information Processing Systems, Apr-24-2026, 18:11:44 GMT

In this appendix, as a complement to Theorems 1-2, we provide additional theorems, Theorems 3-4, which further illustrate the two advantages of the discretization process by considering an abstract model with the discretization bottleneck. For the advantage on the sensitivity, the error due to potential noise and perturbation without discretization -- the third term $\xi(w, r', M', d) > 0$ in Theorem 4 -- is shown to be minimized to zero with discretization in Theorem 3. See Appendix C.1 for a simple comparison between the bound of Theorem 3 and that of Theorem 4 when the metric spaces $(M, d)$ and $(M', d')$ are chosen to be Euclidean spaces. We now introduce the notation used in Theorems 3-4. Here, $\phi_w$ represents a deep neural network with weight parameters $w \in W \subseteq \mathbb{R}^D$, $q_e$ is the discretization process with the codebook $e \in E \subseteq \mathbb{R}^{L \times m}$, and $h_\theta$ represents a deep neural network with parameters $\theta \in \Theta \subseteq \mathbb{R}^\zeta$. Thus, the tuple of all learnable parameters is $(w, e, \theta)$.
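As a rough illustration of why a discretization bottleneck can remove sensitivity to small perturbations, the sketch below implements one common form of discretization: mapping a vector to its nearest codebook entry. This is an assumption, since the excerpt does not state how the discretization process is defined. The function name `quantize` and the toy codebook (3 entries of dimension 2) are hypothetical.

```python
def quantize(z, codebook):
    """Return (index, entry) of the codebook row nearest to z.

    Assumes nearest-neighbour lookup under squared Euclidean distance,
    which is an illustrative choice and not taken from the source.
    """
    def sqdist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    idx = min(range(len(codebook)), key=lambda j: sqdist(z, codebook[j]))
    return idx, codebook[idx]


# Toy codebook with L = 3 entries of dimension m = 2 (hypothetical values).
codebook = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]]

clean = [0.9, 1.2]
noisy = [0.95, 1.1]  # small perturbation of the clean input

print(quantize(clean, codebook))  # nearest entry for the clean input
print(quantize(noisy, codebook))  # same entry, so the perturbation is absorbed
```

As long as a perturbation does not move the input across a boundary between codebook cells, the quantized output is unchanged, which is the intuition behind a sensitivity term that vanishes under discretization.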

artificial intelligence, deep learning, machine learning, (19 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Derivations of Formulas

Neural Information Processing Systems, Apr-24-2026, 15:17:41 GMT

We have omitted a number of complicated formulas in the main text to provide clear intuition and a concise proof sketch. We list all the mentioned formulas here for readers' reference. We consider the case where $U = V = A$ and $\Sigma$ is symmetric and full-rank, and we use gradient flow. We can derive the dynamics of $S = AA^\top$ as $\dot{S} = (\Sigma - S)S + S(\Sigma - S)$, which is a quadratic ordinary differential equation that is hard to solve directly. For simplicity, define $X := S^{-1} - \Sigma^{-1}$. Then $\dot{X} = -X\Sigma - \Sigma X$ (24). Solving this equation, we have ... It is also interesting to verify the relation between $S(t) + P(t)$ and $\Sigma$ by using the following lemma.
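The change of variables can be sanity-checked numerically in the scalar case, where the dynamics reduce to $\dot{s} = 2(\sigma - s)s$. The sketch below assumes the substitution is $x = 1/s - 1/\sigma$ (the scalar analogue of $S^{-1} - \Sigma^{-1}$), which gives the linear equation $\dot{x} = -2\sigma x$ and hence $x(t) = x(0)e^{-2\sigma t}$. The function names, initial value, and step count are illustrative choices, not taken from the source.

```python
import math


def riccati_rhs(s, sigma):
    # Scalar case of dS/dt = (Sigma - S) S + S (Sigma - S).
    return 2.0 * (sigma - s) * s


def integrate_rk4(s0, sigma, t_end, steps):
    """Integrate the scalar quadratic ODE with the classical RK4 scheme."""
    s = s0
    h = t_end / steps
    for _ in range(steps):
        k1 = riccati_rhs(s, sigma)
        k2 = riccati_rhs(s + 0.5 * h * k1, sigma)
        k3 = riccati_rhs(s + 0.5 * h * k2, sigma)
        k4 = riccati_rhs(s + h * k3, sigma)
        s += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return s


def closed_form(s0, sigma, t):
    """Solve via x = 1/s - 1/sigma, which obeys dx/dt = -2 * sigma * x."""
    x0 = 1.0 / s0 - 1.0 / sigma
    x_t = x0 * math.exp(-2.0 * sigma * t)
    return 1.0 / (x_t + 1.0 / sigma)


if __name__ == "__main__":
    s0, sigma = 0.1, 2.0
    for t in (0.5, 1.0, 5.0):
        print(t, integrate_rk4(s0, sigma, t, 5000), closed_form(s0, sigma, t))
```

Both columns agree and approach $\sigma$ as $t$ grows, which matches the expectation that the linearized variable decays to zero.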

artificial intelligence, eigenvalue, equation, (13 more...)

Technology: Information Technology > Artificial Intelligence (0.47)
